Mining High Average-Utility Itemsets with an Indexed Projection Technique
Abstract
Traditional utility mining evaluates an itemset only by the individual profits and quantities of its items in transactions and ignores the length of the itemset. The average-utility measure, defined as the total utility of an itemset divided by the number of items it contains, was therefore proposed to give a better indication of utility than the original utility measure. However, the approach proposed with that measure relied on level-wise processing to find high average-utility itemsets in a database. In this paper, we propose an efficient average-utility mining approach that adopts a projection technique and an indexing mechanism to speed up execution and reduce memory requirements during mining. Finally, experimental results on a real dataset show the superior performance of the proposed...
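The average-utility measure described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the utility formula, so the code assumes the usual definition in utility mining: an itemset's utility is the sum of quantity times unit profit for its items, taken over the transactions that contain every item of the itemset. The names `PROFIT`, `TRANSACTIONS`, `utility`, and `average_utility` and the toy data are hypothetical, and the sketch shows only the measure, not the paper's projection or indexing technique.

```python
from typing import Dict, Iterable, List

# Hypothetical toy data: unit profit per item, and purchased quantity
# of each item in each transaction.
PROFIT: Dict[str, int] = {"A": 3, "B": 10, "C": 1}
TRANSACTIONS: List[Dict[str, int]] = [
    {"A": 2, "B": 1},
    {"A": 1, "C": 4},
    {"A": 1, "B": 2, "C": 1},
]


def utility(itemset: Iterable[str],
            transactions: List[Dict[str, int]],
            profit: Dict[str, int]) -> int:
    """Total utility of an itemset (assumed standard definition).

    Sums quantity * unit profit of the itemset's items over every
    transaction that contains all of them.
    """
    items = set(itemset)
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in items):
            total += sum(transaction[item] * profit[item] for item in items)
    return total


def average_utility(itemset: Iterable[str],
                    transactions: List[Dict[str, int]],
                    profit: Dict[str, int]) -> float:
    """Total utility divided by the number of items in the itemset."""
    items = set(itemset)
    if not items:
        raise ValueError("itemset must contain at least one item")
    return utility(items, transactions, profit) / len(items)


if __name__ == "__main__":
    for candidate in (["A"], ["A", "B"], ["A", "B", "C"]):
        print(candidate,
              utility(candidate, TRANSACTIONS, PROFIT),
              average_utility(candidate, TRANSACTIONS, PROFIT))
```

On this toy data, {A, B} has a total utility of 39 and an average utility of 19.5, while {A, B, C} has a total utility of 24 and an average utility of 8. Dividing by the itemset length makes itemsets of different lengths comparable against a single threshold.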
Similar resources
A New Algorithm for High Average-utility Itemset Mining
High utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...
Efficient Mining of High Utility Itemsets from Large Datasets
High utility itemset mining extends frequent pattern mining to discover itemsets in a transaction database with utility values above a given threshold. However, mining high utility itemsets presents a greater challenge than frequent itemset mining, since high utility itemsets lack the anti-monotone property of frequent itemsets. Transaction Weighted Utility (TWU) proposed recently by researche...
Tightening Upper Bounds of Utility Values in Utility Mining
Utility mining has recently become an emerging research issue in data mining due to its practical applications. In this paper, using a projection technique, we propose an efficient algorithm for finding high utility itemsets in databases. In particular, an improved upper-bound strategy in the proposed algorithm is designed to further tighten the upper bounds of the utility values for ...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...
A Bottom-Up Projection Based Algorithm for Mining High Utility Itemsets
Mining High Utility Itemsets from a transaction database means finding itemsets that have utility above a user-specified threshold. This problem is an extension of Frequent Itemset Mining, which discovers itemsets that occur frequently (i.e., with an occurrence count larger than a user-given value). The problem of finding High Utility Itemsets is challenging, because the anti-monotone property so use...